m900 tower port + TPM1 counter auth and defend lock fixes #2118
tlaurion wants to merge 16 commits into
Pull request overview
This PR ports the Lenovo M900 Tower (Skylake/Kaby Lake LGA1151 mini-tower) to Heads with two board variants (maximized, hotp-maximized), plus targeted TPM1 reliability fixes in tpmr.sh so that auth-failure detection and tpm1_reset() recover from the TPM_DEFEND_LOCK_RUNNING state after multiple bad passphrases. Documentation in doc/tpm.md is expanded with TPM1 vs TPM2 error-stream conventions, auth grep patterns, and the defend-lock recovery flow.
Changes:
- New `EOL_m900_tower-{maximized,hotp-maximized}` boards with shared coreboot/linux configs and an ME blob preparation pipeline (download → me_cleaner → deguard).
- TPM1 auth grep patterns extended to include `defend`/`0x98e`/`0x149`; `tpm1_reset()` cycles physical presence on `defend lock running` after `forceclear`.
- CircleCI: two new build jobs (depend on the existing `EOL_t480-hotp-maximized` 25.09 seed).
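The extended auth-failure detection listed above can be pictured with a small classifier. This is only a sketch: the function name `is_retryable_auth_failure` is hypothetical, the `auth` alternative stands in for whatever patterns `tpmr.sh` already matched, and the sample strings in the usage note are invented for illustration rather than captured tool output. Only `defend`, `0x98e` and `0x149` come from this PR.

```shell
#!/bin/sh
# Sketch only: decide whether TPM tool output looks like an auth failure
# that should trigger a passphrase retry instead of a fatal error.
# Assumption: 'auth' is a placeholder for the pre-existing patterns;
# 'defend', '0x98e' and '0x149' are the additions named in this PR.
is_retryable_auth_failure() {
    # $1: captured output of a TPM command
    printf '%s\n' "$1" | grep -qiE 'auth|defend|0x98e|0x149'
}
```

For example, `is_retryable_auth_failure "defend lock running"` succeeds, while a message that mentions none of the patterns makes the function return non-zero so the caller can treat it as fatal.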
Reviewed changes
Copilot reviewed 10 out of 14 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| boards/EOL_m900_tower-maximized/EOL_m900_tower-maximized.config | New maximized board config (no HOTP). |
| boards/EOL_m900_tower-hotp-maximized/EOL_m900_tower-hotp-maximized.config | New hotp-maximized variant. |
| config/coreboot-m900-maximized.config | Coreboot 25.09 config for Lenovo M900. |
| config/linux-m900.config | Linux 6.1.8 kernel config for the board. |
| targets/m900_me_blobs.mk | Make rules tying the ME blob script into the board build. |
| blobs/m900/m900_download_clean_deguard_me.sh | Downloads ASRock BIOS, neuters/deguards ME 11.6.0.1126. |
| blobs/m900/README.md | Blob layout, sources and integrity notes. |
| blobs/m900/hashes.txt | SHA256 of ME/IFD/GBE blobs. |
| blobs/m900/.gitignore | Ignores generated me.bin/m900_me.bin. |
| initrd/bin/tpmr.sh | Adds defend-lock detection in auth-retry grep and tpm1_reset() recovery sequence. |
| doc/tpm.md | New sections on TPM1/TPM2 auth error patterns and defend-lock recovery. |
| .circleci/config.yml | Adds the two M900 board build jobs (depending on the 25.09 seed). |
78366db to a9cd1fc
a9cd1fc to d8d7665
863c9b7 to 6919d44
the fix did not help.
PR #2068 (tpm_reseal_ux-integrity_report-detect_disk_and_tpm_swap, merged at d3d8053) changed increment_tpm_counter from hardcoded -pwdc '' (empty counter auth) to -pwdc "${tpm_passphrase:-}" (owner passphrase from cache/prompt), but left check_tpm_counter using empty -pwdc when called from kexec-sign-config.sh without a $3 passphrase argument. This caused every counter increment to compute SHA1(owner_pass) while the counter was created with SHA1(""), producing a persistent TPM_AUTH_FAIL.

Per TCG TPM Main Spec Part 3, TPM_CreateCounter uses owner auth (-pwdo) but TPM_IncrementCounter uses the counter's own authData, not the owner password. The correct design for Heads' rollback counter is empty auth: rollback security comes from the signed /boot/kexec_rollback.txt and TPM sealing, not counter access control.

The repeated auth failures (3 per boot x ~5 boots via the _tpm_auth_retry loop) triggered TPM 1.2 dictionary-attack lockout (TPM_DEFEND_LOCK_RUNNING), which persisted through forceclear on some implementations, causing tpm takeown to fail and TPM reset to abort: a cascade failure from the counter auth mismatch.

Changes:
- initrd/bin/tpmr.sh (_tpm_auth_retry, tpm2_counter_inc, tpm2_seal, tpm1_seal): add 'defend' and '0x98e|0x149' to auth detection grep patterns so defend lock and TPM2 RC codes are treated as retryable auth failures rather than fatal errors
- initrd/bin/tpmr.sh (tpm1_reset): detect defend lock after takeown failure and cycle physical presence to clear the lock state before retrying; full AC power cycle remains the fallback if software presence is insufficient
- initrd/bin/tpmr.sh (tpm1_counter_increment): detect -pwdc '' and call tpm directly, bypassing _tpm_auth_retry which injected the owner passphrase. Use || return to survive set -e on expected auth failure.
- initrd/etc/functions.sh (check_tpm_counter): pass -pwdc '' instead of -pwdc "${tpm_passphrase:-}" so counters use SHA1("") per TCG spec. Document that $3 is intentionally ignored.
- initrd/etc/functions.sh (increment_tpm_counter): try -pwdc '' first for TPM1. If that fails on a readable counter (created by PR #2068 era code), prompt for owner passphrase and retry as migration fallback with clear WARN explaining the one-time migration and TPM reset option.
- initrd/etc/functions.sh (increment_tpm_counter): remove the TPM1-specific owner-passphrase prompt block added by PR #2068
- initrd/etc/functions.sh (increment_tpm_counter): DIE-path fallback counter_create: -pwdc '' for consistency
- initrd/bin/oem-factory-reset.sh: counter_create -pwdc '' for consistency with the empty-auth design
- doc/tpm.md: document TPM1 boot chain, tpmtotp tool selection, auth retry patterns, defend lock recovery, and physical presence

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
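The "empty auth first, owner passphrase as a one-time migration fallback" order described in this commit message can be sketched as follows. Everything here is illustrative: `fake_counter_increment` is a hypothetical stand-in for the real TPM call so that the sketch runs without a TPM, `COUNTER_AUTH_AT_CREATE` models the auth the counter was created with, and `increment_with_migration_fallback` is not the actual Heads function.

```shell
#!/bin/sh
# Sketch only, not the Heads implementation.
# Assumption: the counter's authData is modelled as a plain string; a real
# TPM 1.2 compares authorization derived from SHA1 of the password.
COUNTER_AUTH_AT_CREATE=''   # '' = counter created with empty auth

# Hypothetical stand-in for the TPM increment: succeeds only when the
# supplied counter auth matches the auth used at creation time.
fake_counter_increment() {
    [ "$1" = "$COUNTER_AUTH_AT_CREATE" ]
}

# Try empty counter auth first; fall back to the owner passphrase once,
# for counters created while the regression was in effect.
increment_with_migration_fallback() {
    owner_pass="$1"
    if fake_counter_increment ''; then
        echo "incremented with empty counter auth"
        return 0
    fi
    echo "WARN: empty auth failed, retrying with owner passphrase (legacy counter)" >&2
    if fake_counter_increment "$owner_pass"; then
        echo "incremented with owner passphrase"
        return 0
    fi
    return 1
}
```

Trying the empty auth first means a correctly created counter never sends a wrong password to the TPM, which is the failure mode that accumulated toward lockout in the regression.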
6919d44 to 9d1e115
Went down the rabbit hole, found the regression, pushed a fix to #2117, and rebased this PR on top of it.
31e8838 to 3e31d39
Signed-off-by: notgivenby <notgivenby@gmail.com> Signed-off-by: Thierry Laurion <insurgo@riseup.net>
…900_tower board Signed-off-by: notgivenby <notgivenby@gmail.com> Signed-off-by: Thierry Laurion <insurgo@riseup.net>
- blobs/m900/README.md: fix blob filenames, spelling (paritally->partially, Unfourtunatly->Unfortunately, layot->layout)
- blobs/m900/m900_download_clean_deguard_me.sh: fix Dell->ASRock comment
- boards/EOL_m900_tower-*: fix m900_tiny->m900_tower, fix ME script path, add 'tower' to CONFIG_BOARD_NAME
- targets/m900_me_blobs.mk: rewrite header with accurate instructions

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
Signed-off-by: Thierry Laurion <insurgo@riseup.net>
3e31d39 to 22ca620
The m900 ME blob download/deguard script was not wired into the x86_blobs CI job. Add it after the xx80 steps, following the same pattern as other board families.

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
8838b18 to 3ef9c50
@notgivenby hopefully this works. I added the m900 blobs download to CircleCI so that it downloads once and reuses the cache if the blobs are already there and the checksums match, reworked the script to reuse the added lib, and included fixups for TPM1. Keep me posted. Hopefully #2117 (this PR is based on it) fixes your issue.
You have really invested a lot of time in debugging... I do not want to misuse you and your time here. @tlaurion, please let me know if you think we should stop trying to port what is, let us say, not the most popular board, for only a few people. The issue may still be relevant for other ports, who knows. For future attempts, perhaps I should select a desktop board from another vendor, perhaps with TPM2?
@notgivenby this is not specific to the m900, apart from the fact that, as opposed to TPM2, we do not configure the dictionary-attack settings (how many attempts per timeframe, or the resolution period). As said under #2117, master contains a regression in how we create the counter and how we increment it. Your platform has DA lockout; it might take 24h before automatic resolution. I've worked a bit on it today but haven't finished adding code to troubleshoot this properly. Maybe tomorrow, most probably Monday.
@tlaurion the problem is resolved now!
896a49e to ea72a86
…tool

PR #2118 adds the m900 tower board port, refactors blob scripts to use shared blobs/lib.sh, fixes TPM1 counter auth regression (PR #2068, restores empty counter auth per TCG spec), adds CI integration, and introduces TPM dictionary attack diagnostics and protection. This commit covers the TPM DA portion:

TPM dictionary attack (DA) lockout is a standard TPM 1.2 and 2.0 feature that blocks authorization after repeated auth failures to thwart brute-force attacks. Until now Heads had no way to query DA state from the recovery shell, no protection against accidental DA accumulation during counter increments, and no diagnostic tool to test DA lockout behavior. The PR #2068 regression (3 TPM_AUTH_FAIL per boot, 5 boots to lockout) exposed this gap.

Three features across both TPM versions:

1. da_state (tpmr.sh da_state)
   Query DA state from the recovery shell. TPM1 uses tpm getcapability TPM_CAP_DA_LOGIC (0x19) for TPM_DA_INFO/TPM_DA_INFO_LIMITED. TPM2 uses tpm2 getcap properties-variable for LOCKOUT_COUNTER, MAX_AUTH_FAIL, LOCKOUT_INTERVAL, LOCKOUT_RECOVERY. Both output raw TPM data, human-readable policy with TCG terminology (maxTries, recoveryTime, failedTries), formatted lockout timer (hours/min/sec), and a DA: machine line for script consumption.

2. Preflight guard (increment_tpm_counter)
   DA: line timer > 0 means actively locked (DIE with remaining time). TPM1: timer=0 means state inactive, count may be above threshold (WARN, proceed: a correct empty-auth increment resets the counter). TPM2: count >= threshold sets timer to estimated unlock seconds. Count >= threshold-1 without lockout means nearing threshold (WARN). Runs for both TPM versions under CONFIG_TPM=y.

3. bad_auth (tpmr.sh bad_auth)
   Diagnostic tool: reads counter from /boot/kexec_rollback.txt, verifies existence, increments with deliberately wrong password. TPM1: -pwdc <wrong>. TPM2: -P <wrong> (NV index auth produces TPM2_RC_AUTH_FAIL 0x98e, increments LOCKOUT_COUNTER). Shows DA state before/after each attempt.

Also:
- tpm1_reset: simplified defend lock recovery (single PP cycle, then check da_state or tpm-reset.sh to clear TPM). Removed ineffective second forceclear, ResetLockValue, sleep+retry. No AC power cycle guidance (behavior is platform-specific, not defined by TCG spec).
- tpm-reset.sh: TODO to consolidate with gui-init.sh reset_tpm()
- doc/tpm.md: document all features, escalation table, TPM reset methods, DA parameter configurability (TPM2 configurable; TPM1 firmware-set, no software-accessible command)

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
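The preflight decision described in item 2 can be sketched as a pure function over three numbers. This is a simplified illustration, not the code in increment_tpm_counter: the name `da_preflight`, the argument order and the message wording are hypothetical, and the TPM1/TPM2 differences are collapsed into a single rule (timer > 0 is fatal, count >= threshold-1 is a warning).

```shell
#!/bin/sh
# Sketch only: classify DA state before attempting a counter increment.
# Arguments (all assumed to be non-negative integers):
#   $1 failed-tries count, $2 lockout threshold, $3 remaining lockout seconds
da_preflight() {
    count="$1"; threshold="$2"; timer="${3:-0}"
    if [ "$timer" -gt 0 ]; then
        # Actively locked: report the remaining time as hours/min/sec.
        h=$((timer / 3600)); m=$((timer % 3600 / 60)); s=$((timer % 60))
        echo "DIE: TPM locked, ${h}h ${m}m ${s}s remaining"
        return 1
    fi
    if [ "$count" -ge $((threshold - 1)) ]; then
        # Not locked, but one more failure could trigger lockout.
        echo "WARN: failed tries ($count) near or above threshold ($threshold)"
        return 0
    fi
    echo "OK"
}
```

Called as `da_preflight 5 5 3725`, the sketch reports a lockout with 1h 2m 5s remaining and returns non-zero; with a timer of 0 and a low count it prints OK and lets the increment proceed.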
b8d3fc0 to 148688e
Fix da_timer sed pattern: use sed -n with /p flag so that when the DA: line has no timer= field (TPM2 count < threshold), da_timer stays empty instead of getting the full line string.

Improve inline documentation in tpm1_da_state, tpm2_da_state, tpm1_bad_auth, and tpm2_bad_auth to explain purpose, TPM version differences, and callers.

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
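The sed behaviour behind this fix is easy to reproduce in isolation. In the sketch below the function names and the `count=`/`threshold=` fields of the sample DA: line are assumptions made for illustration; only the `timer=` field and the `sed -n ... /p` idiom come from the commit message.

```shell
#!/bin/sh
# Without -n, sed prints every input line whether or not the substitution
# matched, so a line lacking timer= comes back unchanged.
extract_timer_buggy() {
    printf '%s\n' "$1" | sed 's/.*timer=\([0-9]*\).*/\1/'
}

# With -n and the p flag, sed prints only lines where the substitution
# succeeded, so a missing timer= field yields an empty result.
extract_timer_fixed() {
    printf '%s\n' "$1" | sed -n 's/.*timer=\([0-9]*\).*/\1/p'
}
```

With a line that has no `timer=` field, `extract_timer_buggy` returns the whole line, which a later numeric comparison would choke on, while `extract_timer_fixed` returns an empty string; when the field is present both return just the digits.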
34f7960 to d76596a
Supersedes #2111, rebased on #2117 (tpm1_fixes).
This PR merges the m900 tower board port with the TPM1 fixes from PR #2117.
Board port (notgivenby's commits):
TPM1 fixes from PR #2117 (included via rebase on tpm1_fixes):
@notgivenby so you can test artifacts.